<?xml version="1.0" encoding="UTF-8"?><xml><records><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>13</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Antigoni-Maria Founta</style></author><author><style face="normal" font="default" size="100%">Constantinos Djouvas</style></author><author><style face="normal" font="default" size="100%">Despoina Chatzakou</style></author><author><style face="normal" font="default" size="100%">Ilias Leontiadis</style></author><author><style face="normal" font="default" size="100%">Jeremy Blackburn</style></author><author><style face="normal" font="default" size="100%">Gianluca Stringhini</style></author><author><style face="normal" font="default" size="100%">Athena Vakali</style></author><author><style face="normal" font="default" size="100%">Michael Sirivianos</style></author><author><style face="normal" font="default" size="100%">Nicolas Kourtellis</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Large Scale Crowdsourcing and Characterization of Twitter Abusive Behavior</style></title><tertiary-title><style face="normal" font="default" size="100%">ICWSM-18</style></tertiary-title></titles><dates><year><style  face="normal" font="default" size="100%">2018</style></year></dates><publisher><style face="normal" font="default" size="100%">AAAI</style></publisher><pub-location><style face="normal" font="default" size="100%">Stanford, California</style></pub-location><abstract><style face="normal" font="default" size="100%">&lt;p&gt;In recent years, offensive, abusive and hateful language, sexism, racism and other types of aggressive and cyberbullying behavior have been manifesting with increased frequency, and in many online social media platforms. In fact, past scientific work focused on studying these forms in popular media, such as Facebook and Twitter. 
Building on such work, we present a holistic 8-month study of the various forms of abusive behavior on Twitter. Departing from past work, we simultaneously examine a wide variety of labeling schemes that cover different forms of abusive behavior. We propose an incremental and iterative methodology that utilizes the power of crowdsourcing to annotate a large-scale collection of tweets with a set of abuse-related labels. By applying our methodology, including statistical analysis for label merging or elimination, we identify a reduced but robust set of labels. Finally, we offer a first overview and findings of our collected and annotated dataset of 100 thousand tweets, which we make publicly available for further scientific exploration.&lt;/p&gt;
</style></abstract></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>10</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Joan Serrà</style></author><author><style face="normal" font="default" size="100%">Ilias Leontiadis</style></author><author><style face="normal" font="default" size="100%">Dimitris Spathis</style></author><author><style face="normal" font="default" size="100%">Gianluca Stringhini</style></author><author><style face="normal" font="default" size="100%">Jeremy Blackburn</style></author><author><style face="normal" font="default" size="100%">Athena Vakali</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Class-based Prediction Errors to Categorize Text with Out-of-vocabulary Words</style></title><tertiary-title><style face="normal" font="default" size="100%">ALW1'17</style></tertiary-title></titles><dates><year><style  face="normal" font="default" size="100%">2017</style></year></dates><pub-location><style face="normal" font="default" size="100%">Vancouver, Canada</style></pub-location><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p&gt;Common approaches to text categorization essentially rely either on n-gram counts or on word embeddings. This presents important difficulties in highly dynamic or quickly-interacting environments, where the appearance of new words and/or varied misspellings is the norm. A paradigmatic example of this situation is abusive online behavior, with social networks and media platforms struggling to effectively combat uncommon or non-blacklisted hate words. To better deal with these issues in those fast-paced environments, we propose using the error signal of class-based language models as input to text classification algorithms. 
In particular, we train a next-character prediction model for any given class, and then exploit the error of such class-based models to inform a neural network classifier. This way, we shift from the ability to describe seen documents to the ability to predict unseen content. Preliminary studies using out-of-vocabulary splits from abusive tweet data show promising results, outperforming competitive text categorization strategies by 4–11%.&lt;/p&gt;
</style></abstract></record></records></xml>